I used Copilot AI Vision to browse the web for me, and it has big potential
Decades ago, if you wanted information on anything, you would go to the library and open a book. That changed with the emergence of the web and search engines, where now all you have to do is type in a search query and get all the information you could ever want.
As if that wasn’t easy enough, artificial intelligence (AI) is here to make information gathering even more hands-off.
Earlier this month, Microsoft launched Copilot Vision, an experience in which Copilot can view and understand the context of what you’re doing online to provide verbal real-time assistance. The idea is that when you need feedback or advice while browsing, you can tap into a live assistant for help.
The experience lives in Microsoft Edge and is available in preview through Copilot Labs to a select group of US Copilot Pro subscribers on Windows, who pay $20 per month for the subscription. I got early access and put it to the test. Is the subscription worth it to access Copilot Vision? Here’s my experience.
An assistant that understands and sees all
In theory, searching the web is pretty self-explanatory, so getting help with it may seem superfluous. However, when I went through the onboarding demo experience, I got pretty excited, as the applications seemed genuinely useful.
Example use cases included having several pictures on my screen and asking Copilot Vision to help sort through them. In one instance, there were many pictures of various dog breeds, and I was prompted to ask Copilot Vision to tell me more about them verbally. The assistant reviewed each picture and told me more about each breed — despite there being no text on the screen.
In another example, there were pictures of different cities on the screen, again with no text, and I was prompted to ask which one was the oldest. Copilot Vision identified each city and explained which was the oldest and why.
In my favorite example, the AI tool took a sample article and summarized it for me when I asked. I can see this being a powerful tool for research, especially if you’re looking for information on one specific thing and don’t want to skim the piece yourself to see if it has what you need. Now, in theory, you can just ask.
In all of the demos, Copilot Vision’s ability to understand me was very impressive; it understood me regardless of how quickly I spoke or whether I mumbled, which was a major plus because it made the experience smooth and intuitive. However, when it was time to use it on websites of my own choosing, the real-life applications left me slightly disappointed.
The limitations
Right now, Copilot Vision can access a limited number of sites, including Wikipedia, Tripadvisor, Amazon, Target, OpenTable, Wayfair, Food & Wine, Williams Sonoma, and GeoGuessr.
The majority of these are shopping websites, and in my experience, there isn’t much meaningful assistance it can offer for online shopping. It was able to help me navigate the shopping sites, guide me to specific sections, such as deals, and figure out which tabs to click.
For example, on Amazon, when I asked it to help me find something to get my mom for Christmas, it suggested which tabs on the site I should click on to find items that would interest her. The Vision portion wasn’t especially helpful because I could see the tabs on the site myself. It then offered me generic product suggestions, such as a book or sweater.
When I clicked on a random tab, I asked it to give me feedback on what the best gift for her would be from the options shown. It picked the first item on the screen, which in this case was an Amazon Fire HD tablet, listing its obvious and topline use cases — again, not very helpful.
On the three content sites available — Wikipedia, Tripadvisor, and Food & Wine — Copilot Vision showed more promise because it was able to summarize the articles’ contents, which seems like a major productivity win for workers, students, and others.
On Food & Wine, which has a more traditional homepage with its most trending articles displayed, the AI was also helpful in giving a rundown of what I was looking at, briefly explaining the top story and other featured articles.
However, I don’t use Wikipedia for my research because its entries are written by third parties, and the other two sites are very niche in their focus, so I’m not sure how helpful Copilot Vision will be unless you already happen to be on one of these sites.
The last two sites, OpenTable and GeoGuessr, serve more specialized use cases. On OpenTable, a site used to browse for restaurants and book reservations, it wasn’t very helpful because, again, it can only assist with what you’re looking at. For example, if you ask for recommendations for a Mexican restaurant that night, it will simply tell you what is already visible.
Exploring GeoGuessr is where Copilot Vision was the most helpful, acting as an assistant who knew all the answers. Like having a very informed partner on your team, the AI gave me some helpful tips — which is cheating, in a way.
Security concerns
Naturally, having an AI model look at your screen raises concerns about it looking at your data. To address these, Microsoft has a robust Q&A that answers people’s most burning questions.
For starters, the company reassures users that Copilot Vision only views their Edge window during an active Vision session, which is indicated by a colored browser frame. This is the major differentiator from Recall, a feature that takes snapshots of a user’s screen in the background at all times when users on Copilot+ PCs opt in, and which has drawn considerable controversy.
According to the company, Copilot’s responses are only logged to monitor unsafe interactions, but user inputs, including text, images, and contexts, are never stored. Furthermore, a user’s input data is deleted once a session ends.
Is it worth it?
Right now, getting Copilot Pro just for Copilot Vision would not be worth it, especially because access is not guaranteed. However, with the $20-per-month subscription, users also gain access to other perks, such as priority access to the latest models and Copilot in select Microsoft 365 apps, including Word, Excel, PowerPoint, OneNote, and Outlook. That makes Copilot Vision a fun bonus to tinker with if you’re already a Copilot power user and could benefit from these other perks.